Sourcing the Crowd for a Few Good Ones: Event Type Detection

Authors

  • Tommaso Caselli
  • Chu-Ren Huang
Abstract

This paper reports a crowdsourcing experiment on the identification and classification of event types in Italian. The data collected show that the task is not trivial (360 trusted judgments collected vs. 475 untrusted ones), but it has been shown to be linguistically felicitous. The overall accuracy of the annotation is 61.6%. A reliability threshold assigned to the workers allows us to identify the sub-population that has the awareness to perform this complex task, and the accuracy of this sub-population rises to 93%. Our hypothesis is that although the initial crowdsourced data is necessarily noisy, it can yield high-quality results if the sub-population of ‘good’ workers can be identified. In other words, crowdsourcing offers a solution to difficult annotation tasks as long as there is an effective way to identify the reliable workers.

Identificare Annotatori Affidabili: Riconoscimento di Tipi di Evento

Questo articolo descrive un esperimento di crowdsourcing per il riconoscimento e la classificazione dei tipi di evento in Italiano. I dati raccolti mostrano che il compito non è banale (360 giudizi affidabili vs. 475 giudizi non affidabili), ma dimostra di essere linguisticamente “felice”. L’accuratezza globale della annotazione è del 61,6%. Una soglia di affidabilità assegnata ai lavoratori ci permette di identificare la sotto-popolazione che ha la consapevolezza di svolgere questo compito complesso la cui accuratezza arriva fino al 93%. La nostra ipotesi è che, sebbene i dati iniziali ottenuti tramite tecniche di crowdsourcing siano necessariamente rumorosi, dei risultati di buona qualità possono essere ottenuti se la sotto-popolazione di "buoni" lavoratori è identificabile. In altre parole, il crowdsourcing offre una soluzione per compiti di annotazione difficili finché vi è un modo efficace per identificare i lavoratori affidabili.
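The reliability-threshold idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how worker reliability was scored, so the sketch assumes reliability is a worker's share of judgments that match a gold label. The function names, the event type labels ("state", "process"), the toy data, and the threshold of 0.7 are all hypothetical.

```python
from collections import defaultdict


def accuracy(judgments):
    """Fraction of (worker, label, gold) judgments whose label matches gold."""
    if not judgments:
        return 0.0
    return sum(label == gold for _, label, gold in judgments) / len(judgments)


def worker_reliability(judgments):
    """Per-worker share of judgments matching gold (assumed reliability score)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for worker, label, gold in judgments:
        total[worker] += 1
        correct[worker] += int(label == gold)
    return {worker: correct[worker] / total[worker] for worker in total}


def filter_reliable(judgments, threshold):
    """Keep only judgments from workers whose reliability meets the threshold."""
    scores = worker_reliability(judgments)
    return [j for j in judgments if scores[j[0]] >= threshold]


# Hypothetical toy data: (worker id, assigned event type, gold event type).
judgments = [
    ("w1", "state", "state"),
    ("w1", "process", "process"),
    ("w2", "state", "process"),
    ("w2", "state", "state"),
    ("w3", "process", "state"),
    ("w3", "state", "process"),
]

overall_accuracy = accuracy(judgments)
reliable = filter_reliable(judgments, threshold=0.7)  # hypothetical threshold
reliable_accuracy = accuracy(reliable)

print(f"all workers: {overall_accuracy:.2f}, reliable workers: {reliable_accuracy:.2f}")
```

In this toy setting reliability and accuracy are computed on the same items, which flatters the filtered result; a real setup would presumably score workers on separate gold items before evaluating their annotations.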


Similar resources

A Complex Event Processing Approach for Crisis-Management Systems

In modern advanced emergency management systems many solutions for decision support have been provided as attempts to support humans to take important decisions for the critical situations recovery. The critical situation detection is a complex procedure that involves both human and machine activities and leads to take a decision for the management and situation recovery. This paper presents an...


Crowd Behavior Recognition for Video Surveillance

Crowd behavior recognition is becoming an important research topic in video surveillance for public places. In this paper, we first discuss the crowd feature selection and extraction and propose a multiple-frame feature point detection and tracking based on the KLT tracker. We state that behavior modelling of crowd is usually coarse compared to that for individuals. Instead of developing genera...


Curriculum-guided Crowd Sourcing of Assessments in a Developing Country

Success of Wikipedia has opened a number of possibilities for crowd sourcing learning resources. However, not all crowd sourcing initiatives are successful. For developing countries, adoption factors like lack of infrastructure and poor teacher training can have an impact on success of such systems. This paper presents an exploratory study to determine if teachers in a developing country are ab...


Error-Correction and Aggregation in Crowd-Sourcing of Geopolitical Incident Information

A discriminative model is presented for crowd-sourcing the annotation of news stories to produce a structured dataset about incidents involving militarized disputes between nation-states. We used a question tree to gather partially redundant data from each crowd worker. A lattice of Bayesian Networks was then applied to error correct the individual worker annotations, the results of which were ...


Guess What? A Game for Affective Annotation of Video Using Crowd Sourcing

One of the most time consuming and laborious problems facing researchers in Affective Computing is annotation of data, particularly with the recent adoption of multimodal data. Other fields, such as Computer Vision, Language Processing and Information Retrieval have successfully used crowd sourcing (or human computation) games to label their data sets. Inspired by their work, we have developed ...




Publication date: 2012